Battery-free wireless imaging of underwater environments

Imaging underwater environments is of great importance to marine sciences, sustainability, climatology, defense, robotics, geology, space exploration, and food security. Despite advances in underwater imaging, most of the ocean and marine organisms remain unobserved and undiscovered. Existing methods for underwater imaging are unsuitable for scalable, long-term, in situ observations because they require tethering for power and communication. Here we describe underwater backscatter imaging, a method for scalable, real-time wireless imaging of underwater environments using fully-submerged battery-free cameras. The cameras power up from harvested acoustic energy, capture color images using ultra-low-power active illumination and a monochrome image sensor, and communicate wirelessly at net-zero-power via acoustic backscatter. We demonstrate wireless battery-free imaging of animals, plants, pollutants, and localization tags in enclosed and open-water environments. The method’s self-sustaining nature makes it desirable for massive, continuous, and long-term ocean deployments with many applications including marine life discovery, submarine surveillance, and underwater climate change monitoring.

The electrical components of the design include a PCB (designed using freely available software (Eagle, Autodesk) and sent for fabrication to a commercial vendor (EasyPCBUSA, Sun Circuits)), an FPGA (IGLOO nano AGLN060, Microsemi), a CMOS monochrome camera sensor (HM01B0, HiMax), a camera connector (609-4320-2-ND, Digikey), and two oscillators (32 kHz and 4 MHz, the two external clocks shown in Fig. 1).
A function generator (SDG 1032X, Siglent) connected to a fabricated piezoelectric transducer (fabrication procedure described above) through an audio amplifier (XLi 3500, Crown) is used as an underwater projector to transmit acoustic signals. An acoustic hydrophone (H2A, Aquarian) is used as a remote receiver to measure underwater sound. The hydrophone is connected to a laptop (XPS 15 7590, Dell), which records sound using open-source audio recording software (Audacity) at a sampling rate of 192,000 samples per second. The signal processing and decoding algorithms are implemented in MATLAB R2020b (Mathworks). The FPGA program is designed using a freely available IDE (Libero SoC v11.9, Microsemi), and the IDE-generated programming file is flashed onto the FPGA using a programmer kit (FlashPro 3, Microsemi).

Evaluation and Testing
The batteryless camera prototype was evaluated qualitatively and quantitatively in enclosed and open water environments.

Enclosed Water Testing Environments
Imaging: Testing in controlled environments was performed in an enclosed water tank with a depth of 1.5 m and a rectangular cross-section of 3 m x 4 m (Supplementary Fig. 6). Here, the projector, hydrophone, and the two transducers of the batteryless camera (for harvesting and backscatter) were all submerged at a depth of 75 cm below the water surface. At the same time, the domes housing the camera and illumination (which are connected to the two transducers using wires) were placed along with the underwater objects in a separate tank to isolate them and control environmental conditions including lighting and nutrient levels. Specifically, the coral reef model and the Protoreaster linckii were co-located with the camera at the base of a smaller tank with a depth of 40 cm and a rectangular cross-section of 40 cm x 50 cm (Supplementary Fig. 6). Similarly, several seeds of Aponogeton ulvaceus were planted in freshwater aquarium substrate in a third tank with the same dimensions (40 cm x 50 cm x 40 cm), and the camera was used to monitor their growth over a period of one week. Images in Fig. 2b, Fig. 3c, and Fig. 3d of the main text demonstrate successful imaging in these evaluation scenarios.
AprilTag Data Collection: Data collection for the AprilTag localization and detection task was performed in the larger tank (3 m x 4 m x 1.5 m). For this task, the camera sensor was submerged in the tank at a depth of 30 cm below the surface and placed at one side of the tank to capture images of the AprilTag. The AprilTag was submerged at the same depth. The experimental trial was repeated by placing the AprilTag at 8 different locations separated by 50 cm, up to a maximum range of 4 m between the AprilTag and the camera (i.e., the edge of the enclosed tank). At each location (i.e., range), we used the camera to capture 20 images of the AprilTag, varying the tag's orientation and angle with respect to the camera in each image, for a total of 160 images (Supplementary Fig. 7). To speed up the data collection process, these images were collected by connecting the FPGA output directly to a USRP N210 software radio (Ettus); this removes the bandwidth limitation of underwater acoustic communication and enables programming the FPGA to transmit captured pixels at a much higher rate (2 Mbps). Note that we did not bypass the FM0 backscatter modulation for the results shown in Fig. 4c; we bypassed only the underwater channel. In addition to this data collection, Fig. 4b of the main text shows a sample AprilTag image captured in this setup using end-to-end batteryless imaging and underwater backscatter communication (at 1 kbps).

Calibration for AprilTag Localization:
To determine an accurate relationship between a 3D location in the environment and its corresponding 2D pixel in the image captured by our underwater camera, we compute the 3 x 3 homography matrix that contains all the physical information (location and orientation) of the tag [3]. Computing this matrix requires the intrinsic parameters of the camera, such as its focal length and optical center. To extract these parameters from our underwater camera, we used a checkerboard calibration method, which is standard in 3D reconstruction problems in computer vision [3]. We captured 150 images of the checkerboard (7 x 10 squares, each measuring 23 mm x 23 mm) from different viewpoints at 3 different distances (50 cm, 80 cm, and 120 cm) and extracted the intrinsic parameters using the multiplane calibration algorithm [4]. This calibration process needs to be completed only once since we used the same underwater camera throughout all of the measurements.
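To illustrate why the intrinsic parameters matter, the following is a minimal sketch of the ideal pinhole projection that maps a 3D point in the camera frame to a pixel. It is not the calibration or localization code used in this work. The function name and the numeric intrinsics (`FX`, `FY`, `CX`, `CY`) are hypothetical placeholders, and the sketch ignores lens distortion and any refraction at the dome.

```python
def project_point(point_xyz, fx, fy, cx, cy):
    """Project a 3D point (camera frame, meters) to pixel coordinates.

    Ideal pinhole model: fx, fy are focal lengths in pixels and (cx, cy)
    is the optical center in pixels. Distortion is ignored (assumption).
    """
    x, y, z = point_xyz
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    u = fx * x / z + cx
    v = fy * y / z + cy
    return u, v


# Hypothetical intrinsics, chosen only for illustration; they are NOT the
# calibrated values of the camera described in the text.
FX, FY = 300.0, 300.0
CX, CY = 160.0, 120.0

# A point on the optical axis lands on the optical center.
center_pixel = project_point((0.0, 0.0, 1.0), FX, FY, CX, CY)

# A point 25 cm to the right of the axis at 50 cm range.
offset_pixel = project_point((0.25, 0.0, 0.5), FX, FY, CX, CY)

print(center_pixel, offset_pixel)
```

Once the intrinsics are known, the same relationship can be inverted for a planar target of known size, which is what the homography-based localization relies on.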

AprilTag Detection and Localization:
After the camera is calibrated, the detection and localization tasks are performed on the captured AprilTag images in the dataset described earlier. The tasks follow procedures similar to prior work on AprilTag localization [5]. The detection algorithm computes the gradient of every pixel and clusters pixels that have similar gradient direction and magnitude into components. After performing a recursive depth-first search, it extracts the edges of the AprilTag. Using these edges, the algorithm finds four-sided regions that have a darker interior than their exterior and verifies whether each region contains valid tag pixels. If the pattern is valid, the detection succeeds, and the region is used to compute the homography matrix, from which the tag's location is obtained.

Open Water Testing Environments
Open water testing of the prototype was performed in Keyser Pond, NH and in the Charles River, MA (Supplementary Fig. 6a, Fig. 6b).
In Keyser Pond, the acoustic transmitter, harvesting and backscatter transducers, and hydrophone were submerged half a meter below the water surface, and the camera sensor was placed at a distance of 50 cm from the plastic water bottle. The image was collected at night, yet the prototype successfully captured color features (as shown in Fig. 3b of the main text) owing to its active illumination method.
Long-range communication experiments were performed in the Charles River, where the acoustic projector, harvesting and backscatter transducers, and hydrophone were all submerged at a depth of 2 m below the water surface. The projector and the backscatter transducer were separated by a distance of 50 cm, and the hydrophone was moved progressively farther away, up to 40 m, to test communication at different distances. For this experiment, the backscatter node was programmed to communicate a known pseudo-random sequence of 50 bits (10 bits of preamble followed by 40 bits of data) in each packet at a data rate of 1 kbps. These bits were constructed in MATLAB and were fed to the transistor switches M1 and M2 using a signal generator (Supplementary Fig. S1). The hydrophone was connected to a USRP N210 to record the received signal for 20 seconds, resulting in 400 packets. For each distance, we recorded data at three different depths (1.5 m, 2 m, and 2.5 m), and for each depth, we computed a single value for BER and different values for SNR (one for each decoded packet). The BER value was computed over all packets by comparing the decoded 50 bits of each packet with the actual transmitted bits. SNR values were computed individually for each packet: the signal power was determined by projecting the received packet onto the transmitted packet, and the noise power was obtained by subtracting the signal power from the total received power. The SNR and BER curves are shown in Fig. 4e of the main text as a function of distance, where the BER curve shows the median value of BER across all three depths and the solid line for SNR represents the median SNR over 900 packets (300 packets * 3 depths). The lower and upper bounds of the shaded region for the SNR curve represent the 10th and 90th percentiles, respectively.
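The BER and per-packet SNR computations described above can be sketched as follows. This is a simplified illustration rather than the MATLAB code used in the evaluation: it assumes real-valued, time-aligned samples, and the function names and toy values are hypothetical.

```python
import math


def bit_error_rate(decoded_packets, reference_bits):
    """Fraction of bits that differ from the known transmitted sequence,
    accumulated over all decoded packets."""
    errors = 0
    total = 0
    for packet in decoded_packets:
        errors += sum(1 for d, r in zip(packet, reference_bits) if d != r)
        total += len(reference_bits)
    return errors / total


def projection_snr_db(received, template):
    """Per-packet SNR estimated by projecting the received samples onto the
    known transmitted waveform (the template).

    Signal power is the energy of the component along the template; noise
    power is the total received energy minus that component. Both are summed
    over the same samples, so their ratio equals the power ratio.
    """
    template_energy = sum(t * t for t in template)
    projection = sum(r * t for r, t in zip(received, template))
    signal_power = projection ** 2 / template_energy
    total_power = sum(r * r for r in received)
    noise_power = total_power - signal_power
    if noise_power <= 0:
        return math.inf
    return 10 * math.log10(signal_power / noise_power)


# Toy example (hypothetical values): a scaled template plus a small
# disturbance that is orthogonal to the template.
template = [1.0, -1.0, 1.0, -1.0]
received = [2.1, -1.9, 2.1, -1.9]
snr_db = projection_snr_db(received, template)

reference = [1, 0, 1, 1]
decoded = [[1, 0, 1, 1], [1, 0, 0, 1]]  # one bit error in the second packet
ber = bit_error_rate(decoded, reference)

print(round(snr_db, 2), ber)
```

In the experiment, the same two functions would be applied per packet and per depth, and the medians and percentiles would then be taken over the resulting values.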
In addition to testing the communication capabilities of our method, we also evaluated its harvesting performance at different ranges. An experiment was performed in the Charles River, where the acoustic projector and the harvester node were submerged at a depth of 2 m below the water surface. The harvester node was moved away from the projector in 50 cm increments, up to 4 m. The open-circuit, rectified, harvested voltage was measured using a digital oscilloscope. For each distance, the harvester node was moved to three different depths (1.5 m, 2 m, and 2.5 m) and the voltage was measured at each depth. At each depth, 3 measurements were taken, resulting in a total of 9 measurements at each range. The harvester node was also moved gradually across the entire water column at each distance to measure the maximum voltage that the harvesting transducer can harvest at that distance. The harvested voltage as a function of distance is plotted in Fig. 4d of the main text, where the maximum harvested voltage is represented as the contour of the shaded region and the 9 measurements at 3 different depths are represented as dots.

Range Analysis
In battery-free backscatter communication systems, the end-to-end communication range is determined by the ability of a remote transmitter to power up the battery-free sensor [7,8]. Hence, to understand the communication range of our underwater battery-free imaging system, we analyze the downlink range between the projector and the battery-free node. Our downlink analysis follows a model introduced in recent work that studied the range of underwater acoustic backscatter communication systems [7].
The downlink communication range of our system is determined by two constraints: (a) the harvested power and (b) the rectified voltage. In particular, the harvested power needs to exceed a minimum threshold for continuous operation, and the rectified voltage needs to exceed a minimum activation voltage required to turn on the LDO (see Energy Harvesting and Power Management in Methods). Since the harvested power and the rectified voltage are both a function of the open-circuit voltage, we first analyze the open-circuit voltage as a function of range, then relate it to the harvested voltage and power.

Open-Circuit Voltage
The voltage at the harvesting transducer is a function of the transmit source level (due to transmit power, projector efficiency, and directivity), range and pathloss (due to absorption, spreading loss, and directivity), and the properties of the harvesting transducer (efficiency, directivity, and sensitivity). Specifically, the RMS open-circuit voltage (Voc) can be expressed as [7,8]:
$$V_{oc} = 10^{\frac{RL + RVS}{20}}$$
where RVS is the receiving voltage sensitivity of the backscatter node's transducer, and RL is the received signal level at the transducer, which itself is a function of the transmit power ($P_{Tx}$), transmit efficiency ($\eta_{Tx}$), range ($R$), directivity of the projector ($DI_{Tx}$), spreading factor ($k$), and absorption coefficient ($\alpha$) as per the following equation [8,9]:
$$RL(P_{Tx}, \eta_{Tx}, R) = 170.8 + 10\log_{10}(P_{Tx}\,\eta_{Tx}) + DI_{Tx} - k \cdot 10\log_{10}(R) - \alpha R$$

Harvested Voltage
The harvested voltage is a function of the open-circuit voltage (Voc). In particular, recall that the harvesting transducer's output (after matching) is passed through a multi-stage rectifier that converts the AC voltage to DC and passively amplifies it. The harvested voltage at the output of the rectifier (Vrect) is a function of the number of stages (N) and the diode threshold voltage (Vth), following the model in [10].
In our prototype implementation, RVS = -180 dB re 1 V/µPa, η_Tx = 0.175, P_Tx = 25 W, DI_Tx = 2.62 dB, k = 1.5, α = 0.0022 dB/m, N = 4, and Vth = 0.35 V.
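As a worked example of the open-circuit voltage model, the sketch below evaluates the received level and the RMS open-circuit voltage using the prototype parameters listed above. It assumes that range is in meters and that the absorption coefficient is in dB per meter; the function names are illustrative. The rectifier stage is deliberately left out because its expression is not reproduced here.

```python
import math


def received_level_db(p_tx_w, eta_tx, di_tx_db, range_m, k, alpha_db_per_m):
    """Received level in dB re 1 uPa: source level minus spreading and
    absorption losses. Assumes range in meters and alpha in dB/m."""
    source_level = 170.8 + 10 * math.log10(p_tx_w * eta_tx) + di_tx_db
    spreading_loss = k * 10 * math.log10(range_m)
    absorption_loss = alpha_db_per_m * range_m
    return source_level - spreading_loss - absorption_loss


def open_circuit_voltage(rl_db, rvs_db):
    """RMS open-circuit voltage in volts from the received level and the
    receiving voltage sensitivity (dB re 1 V/uPa)."""
    return 10 ** ((rl_db + rvs_db) / 20)


# Prototype parameters quoted in the text.
PROTOTYPE = dict(p_tx_w=25.0, eta_tx=0.175, di_tx_db=2.62,
                 k=1.5, alpha_db_per_m=0.0022)
RVS_DB = -180.0

rl_1m = received_level_db(range_m=1.0, **PROTOTYPE)
voc_1m = open_circuit_voltage(rl_1m, RVS_DB)

for r in (1.0, 2.0, 4.0):
    rl = received_level_db(range_m=r, **PROTOTYPE)
    print(f"R = {r:.0f} m: RL = {rl:.1f} dB re 1 uPa, "
          f"Voc = {open_circuit_voltage(rl, RVS_DB):.2f} V RMS")
```

With these parameters the level at 1 m is close to 180 dB re 1 µPa, and the corresponding open-circuit voltage is just under 1 V RMS.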
To study the harvested voltage constraint in our battery-free imaging system, we simulate the rectified voltage as a function of range following the above model (Supplementary Fig. 9a). The figure also plots the minimum activation voltage (dashed horizontal line), which corresponds to 3.2 V in our design. We consider three optimizations for our proof-of-concept prototype, following the parameters highlighted in prior work on underwater backscatter [7]. First, we consider a design whose harvesting transducers have an RVS of -157 dB re 1 V/µPa (instead of -180 dB re 1 V/µPa), and plot the rectified voltage (in blue). Our second optimization considers a projector whose efficiency is 0.5 (instead of 0.175), and we plot the corresponding rectified voltage (in orange). Finally, we study how increasing the transmit power from 25 W to 500 W impacts the harvested voltage as a function of range (in black). The figure shows that with more optimized engineering parameters, the range of an underwater battery-free imaging system may increase to more than 300 meters, consistent with a prior analytical model [7].
It is worth noting that the activation voltage is also a function of our system design parameters. In principle, the main limitation on the voltage is determined by the non-linearity of the harvester electronics, specifically the diodes, whose threshold voltage is 0.35 V. One can approach this threshold voltage (and achieve higher ranges) by increasing the number of stages in the multi-stage rectifier as well as by using rectifiers with lower threshold voltages [11].

Harvested Power
We plot the harvested power as a function of range, following the same parameters as the above model (Supplementary Fig. 9b). We also plot the minimum power (dashed horizontal line) required for our prototype to operate continuously. The plot demonstrates that underwater battery-free imaging may be possible at hundreds of meters under optimized engineering design parameters.
It is worth noting that the harvested power can be further improved by optimizing two other design parameters. First, in addition to the parameters discussed above, it is possible to boost the AC-to-DC power conversion efficiency from 0.16 to a higher realizable efficiency of 0.60 [12]. Second, the end-to-end power transfer efficiency (and range) may be improved by using beamforming. In particular, past work has considered underwater acoustic beamforming and demonstrated that it can enable directivity gains of 16 dB [13]. A natural question here is: how can a projector identify the optimal beamforming direction so that it may electronically steer its array accordingly? If the backscatter node's location is known a priori, then the beamsteering direction may be computed geometrically and the projector can apply the corresponding beamsteering vector. Alternatively, if the backscatter node's location (or the projector's location) is unknown, then the projector can find the correct beam by employing one of the standard beam searching algorithms [14,15]. For example, the projector can first scan different directions by sequentially applying different beamforming vectors. When it reaches the correct direction, the backscatter node powers up and responds with stored bits. The projector uses this feedback to identify the correct direction and continues beamforming in that direction for the remainder of the communication session. Since the transmit source level in our evaluation is already high (180 dB re 1 µPa), such optimized designs will be critical to achieving higher range in future work.
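The sequential scan described above can be sketched as a simple loop. This is an illustration of the idea only, not an implementation of the beam searching algorithms cited in the text. The `node_responds` callback stands in for "steer the array, transmit, and listen for a backscatter reply", and the simulated node with its bearing and beamwidth is entirely hypothetical.

```python
from typing import Callable, Optional, Sequence


def find_beam(directions_deg: Sequence[float],
              node_responds: Callable[[float], bool]) -> Optional[float]:
    """Scan candidate steering directions in order and return the first one
    for which the backscatter node powers up and replies, or None."""
    for direction in directions_deg:
        if node_responds(direction):
            return direction
    return None


def make_simulated_node(true_bearing_deg: float,
                        beamwidth_deg: float) -> Callable[[float], bool]:
    """Hypothetical stand-in for the real feedback path: the node replies
    only when the steered beam covers its bearing."""
    def node_responds(direction_deg: float) -> bool:
        return abs(direction_deg - true_bearing_deg) <= beamwidth_deg / 2
    return node_responds


# Scan from -60 to +60 degrees in 10-degree steps (illustrative values).
candidates = list(range(-60, 61, 10))
node = make_simulated_node(true_bearing_deg=23.0, beamwidth_deg=12.0)
best_direction = find_beam(candidates, node)

# A node outside the scanned sector never replies.
unreachable = find_beam(candidates,
                        make_simulated_node(true_bearing_deg=90.0,
                                            beamwidth_deg=12.0))

print(best_direction, unreachable)
```

Once a direction is found, the projector would keep that steering vector for the rest of the session, as described above.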

Timing Analysis
In this section, we analyze the timing performance of our ultra-low-power imaging platform. Specifically, we analyze the time that the system needs to harvest sufficient energy to power up and the time needed to capture and communicate one full image.

Energy Harvesting Time
Our battery-free camera sensor operates entirely on harvested power, and the time, T, needed to harvest sufficient energy to capture a grayscale image is given by the following equation:
$$T = \frac{E}{P_{harv}}$$
where E is the energy required to capture a grayscale image (Supplementary Table 2) and Pharv is the harvested power. The harvested power depends on the transmit power, distance from the projector, harvesting transducer's RVS, and the efficiency of the harvesting circuit. With our current design parameters (see Range Analysis), it takes around 10-12 seconds to harvest sufficient energy at 1 meter. However, recall from our discussion in Range Analysis that these parameters can be optimized to increase the harvested power, which would reduce the time needed to harvest sufficient energy. Specifically, using the model parameters mentioned in Range Analysis and the equation given above, we plot the harvesting time, T, as a function of distance (Supplementary Fig. 9c). The plot shows that under optimized design parameters, the energy harvesting time is less than a second (i.e., the imaging operation starts almost instantaneously) even beyond 100 meters.
Recall that sending a full image typically requires multiple captures (due to the memory limitations on the FPGA), and one might wonder whether each of these captures requires the above-mentioned harvesting time. That is not the case: the sensor needs the harvesting time only once, at the beginning of the operation. To see why, recall that the system operates in two phases: an image capture phase and a backscatter communication phase. The backscatter communication phase consumes significantly less power (59 µW, Supplementary Table 2) and lasts longer (due to the narrow bandwidth of the underwater acoustic channel). As a result, the capacitor fully recharges during this phase before the system needs to enter the image capture phase again, allowing for uninterrupted operation after the initial harvesting cycle.
Finally, it is worth noting that the above analysis assumes that the system is operating in warm start (i.e., there is some pre-stored charge across the capacitor). During the cold-start phase (which occurs only once in the system's lifetime), the capacitor is fully discharged and the time required to harvest sufficient energy to initiate the operation is given by:
$$T_{cold} = \frac{C\,V_{thres}^{2}}{2\,P_{harv}}$$
where C is the capacitance value (7500 µF) and Vthres is the threshold voltage (3.2 V) across the capacitor needed to initiate the operation. With our current design parameters, it takes 4-5 minutes to harvest sufficient energy at 1 meter to initiate the imaging operation. Moreover, following the same analysis discussed above, optimizing the system design parameters would reduce this initiation time to a few seconds.
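As a rough arithmetic check of the cold-start figure, the sketch below assumes that the cold-start time equals the energy stored in the capacitor at the threshold voltage (one half of C times Vthres squared) divided by the harvested power. The capacitance and threshold come from the text; the harvested power of 150 µW is an assumed illustrative value, not a measurement reported here.

```python
def capacitor_energy_j(capacitance_f, voltage_v):
    """Energy stored in a capacitor charged to the given voltage."""
    return 0.5 * capacitance_f * voltage_v ** 2


def harvest_time_s(energy_j, harvested_power_w):
    """Time to accumulate the given energy at a constant harvested power."""
    return energy_j / harvested_power_w


CAPACITANCE_F = 7500e-6      # from the text
V_THRESHOLD = 3.2            # from the text
ASSUMED_P_HARV_W = 150e-6    # ASSUMPTION: illustrative harvested power

cold_start_energy = capacitor_energy_j(CAPACITANCE_F, V_THRESHOLD)
cold_start_time = harvest_time_s(cold_start_energy, ASSUMED_P_HARV_W)

print(f"Stored energy at threshold: {cold_start_energy * 1e3:.1f} mJ")
print(f"Cold-start time at {ASSUMED_P_HARV_W * 1e6:.0f} uW: "
      f"{cold_start_time / 60:.1f} min")
```

Under this assumed power the result is a little over four minutes, which falls inside the 4-5 minute range quoted above; the same `harvest_time_s` helper applies to the warm-start case when given the per-image capture energy.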

Image Framerate
The framerate of our system depends on the time needed to capture and communicate image data to a remote receiver. Specifically, the total time, T, needed for one full image is the time spent in the image capture and backscatter communication phases, summed over all of the captures that make up the image.
To achieve a higher framerate, we successfully experimented with communicating at 5 kbps (with BERs of 10^-3 at 1 m). At such data rates, the time needed to capture and communicate a grayscale image reduces to ~5 mins (~14 mins for a color image). Moreover, higher framerates are achievable by leveraging past work on underwater backscatter node design, which has demonstrated throughputs of up to 20 kbps [2]; using such designs would further reduce the time for a grayscale image to 1.1 mins (3.4 mins for a color image).

Cost Analysis
The total cost of fabricating and assembling our underwater batteryless imaging sensor prototype is $353.97 (Supplementary Table 3). The main components of the design are the piezoceramic transducers, camera sensor, FPGA, PCB, and housing. The prototype uses a total of six piezoceramic cylinders: two with a resonance frequency of 17 kHz and four with a resonance frequency of 30 kHz. The total cost of the piezoceramic cylinders is $231.5 (45.7*2 + 35*4). The housing of the camera prototype consists of a Telesin dome port, which costs $45, and a smaller acrylic dome priced at $11 to encapsulate the active illumination hardware. The IGLOO nano FPGA costs $12.72, the Himax camera sensor costs $9.95, and the total cost of PCB fabrication is $12. The low fabrication cost of our batteryless prototype, coupled with the fact that it does not require extensive infrastructure in the form of cabling for power and communication [16,17,18,19], makes underwater backscatter imaging a viable method for scalable underwater imaging.

Supplementary Discussion
We discuss the performance of our underwater wireless imaging method in the context of alternative methods for underwater communication.

Comparison to Low-Power Acoustic Modems
Our imaging method leverages acoustic backscatter communication to communicate image data at net-zero power. Our evaluation demonstrates that the method achieves communication ranges that are comparable to state-of-the-art low-power underwater modems, albeit at much lower power. Specifically, a state-of-the-art low-power acoustic modem [20] requires 80 mW to transmit data at 1 kbps over 100 m, while our prototype consumes 59 µW to transmit data at the same rate over 40 m (see Fig. 4e in the main text, and see Backscatter Communication Phase in Supplementary Table 1). Our analysis demonstrates that higher ranges are realizable with more optimized transducers (see Range Analysis in Supplementary Information).
One might wonder whether prior low-power modems could be operated entirely based on harvested acoustic energy and used for net-zero-power underwater imaging. To answer this question, we consider the amount of time needed to harvest sufficient energy to transmit an image using a state-of-the-art low-power modem. Since the modem operates at the same data rate as our backscatter prototype, it would require the same amount of time for image transmission (1362.1 seconds, see Supplementary Table 2) to capture and transmit a grayscale image. Multiplying this by the communication power (80 mW) results in 106.07 Joules, which is 594x higher than our backscatter-based wireless platform. If one were to harvest this energy from an acoustic source (which can typically provide a few hundred microwatts, see Range Analysis in Supplementary Information), it would take 4-6 days to harvest sufficient energy before initiating an imaging operation (in comparison to our power-up time of 10-12 seconds). Thus, it would be impractical to design an underwater battery-free wireless imaging system leveraging prior low-power underwater acoustic modems.

Figures

Fig. 1: Schematic of the hardware design.
The harvester node at the bottom is connected to a multi-stage rectifier followed by a supercapacitor, which stores the harvested energy. The supercapacitor voltage is fed to a 2.8 V LDO and to a 1.4 V DC/DC step-down converter. The output of the DC/DC converter is used to power the FPGA core, and the output of the LDO is used to power the FPGA banks. The FPGA is also connected to two external clocks (32 kHz and 4 MHz) and to the camera via several GPIO pins (pixel clock, line valid, data, power, master clock). The FPGA controls the operation of the MOSFETs connected to the communication transducer on the top left. This transducer is responsible for sending camera data via backscatter communication.

Fig. 2: Demodulation and decoding pipeline.
The signal received by the hydrophone is passed through a band-pass filter, then downconverted and passed through a low-pass filter to remove noise. This signal is then passed through a high-pass filter to remove the signal variations caused by low-frequency surface waves. The demodulated and filtered signal is fed to a maximum likelihood decoder.

Fig. 3: Packetization of pixel data.
The image captured by the CMOS image sensor is divided into 53 segments. Each image segment is divided into 250 packets, where each packet contains data for 6 pixels. The uplink packet structure includes a 16-bit preamble, followed by a 12-bit packet number and a payload of 48 bits. A parity bit is appended to each packet; it is set to 1 if the sum of bits in the payload is even and is set to 0 otherwise.

The camera PCB contains the Himax image sensor, supercapacitor for harvesting energy, power management electronics, and an FPGA for processing and memory. It also contains programming pins to program the FPGA and change camera parameters. The PCB is enclosed in a transparent dome, and the entire structure is tightly screwed to make it waterproof.
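The uplink packet layout described in the Fig. 3 caption (16-bit preamble, 12-bit packet number, 48-bit payload, one parity bit) can be sketched as follows. This is an illustrative reconstruction, not the FPGA implementation. It assumes 8 bits per pixel (48 bits for 6 pixels) and most-significant-bit-first ordering, and the preamble value `0xAAAA` is a hypothetical placeholder because the actual preamble pattern is not given.

```python
PREAMBLE_BITS = 16
NUMBER_BITS = 12
PIXELS_PER_PACKET = 6
BITS_PER_PIXEL = 8  # ASSUMPTION: 48-bit payload / 6 pixels
PAYLOAD_START = PREAMBLE_BITS + NUMBER_BITS
PAYLOAD_END = PAYLOAD_START + PIXELS_PER_PACKET * BITS_PER_PIXEL


def to_bits(value, width):
    """Unsigned integer to a list of bits, MSB first (assumed ordering)."""
    if not 0 <= value < (1 << width):
        raise ValueError(f"{value} does not fit in {width} bits")
    return [(value >> i) & 1 for i in reversed(range(width))]


def parity_bit(payload):
    """1 if the payload contains an even number of ones, 0 otherwise,
    following the rule stated in the caption."""
    return 1 if sum(payload) % 2 == 0 else 0


def build_packet(preamble, packet_number, pixels):
    """Assemble preamble, packet number, pixel payload, and parity bit."""
    if len(pixels) != PIXELS_PER_PACKET:
        raise ValueError("each packet carries exactly 6 pixels")
    payload = []
    for pixel in pixels:
        payload += to_bits(pixel, BITS_PER_PIXEL)
    return (to_bits(preamble, PREAMBLE_BITS)
            + to_bits(packet_number, NUMBER_BITS)
            + payload
            + [parity_bit(payload)])


def parity_ok(packet):
    """Receiver-side check of the parity bit against the payload."""
    payload = packet[PAYLOAD_START:PAYLOAD_END]
    return packet[PAYLOAD_END] == parity_bit(payload)


HYPOTHETICAL_PREAMBLE = 0xAAAA  # placeholder, not the real preamble
packet = build_packet(HYPOTHETICAL_PREAMBLE, 7, [0, 255, 16, 32, 64, 128])

# Flip one payload bit to show that the parity check detects it.
corrupted = list(packet)
corrupted[PAYLOAD_START + 2] ^= 1

print(len(packet), parity_ok(packet), parity_ok(corrupted))
```

Each packet is 77 bits long under these assumptions, and a single flipped payload bit causes the parity check to fail at the receiver.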